Abstract

A class of examples concerning the relationship of linear regression and maximal
correlation is provided. More precisely, these examples show that if two
random variables have (strictly) linear regression on each other, then their 
maximal correlation is not necessarily equal to their (absolute) correlation. 
1 Maximal correlation and linear regression

Let (X, Y) be a bivariate random vector such that its Pearson correlation coefficient,

\[ \rho(X,Y) := \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)}\,\sqrt{\mathrm{Var}(Y)}}, \tag{1} \]

is well defined. If $W$ is a non-degenerate random variable then $L_2^*(W)$ is defined to be
the class of measurable functions $g : \mathbb{R} \to \mathbb{R}$ such that $0 < \mathrm{Var}[g(W)] < \infty$. Under
the present notation, the maximal correlation coefficient is defined as (Gebelein,
1941; Hirschfeld, 1935)

\[ R(X,Y) := \sup_{g_1 \in L_2^*(X),\ g_2 \in L_2^*(Y)} \rho(g_1(X), g_2(Y)). \tag{2} \]

Due to results of Sarmanov (1958a, 1958b), it was believed for some time that if
both X and Y have linear regression on each other, i.e., if for some constants $a_0$, $a_1$,
$b_0$, $b_1$,

\[ E(X|Y) = a_1 Y + a_0 \ \text{(a.s.)}, \qquad E(Y|X) = b_1 X + b_0 \ \text{(a.s.)}, \tag{3} \]

then

\[ R(X,Y) = |\rho(X,Y)|. \tag{4} \]
The implication $(3) \Rightarrow (4)$ was cited in a number of subsequent works related to
maximal correlation of order statistics and records, including Rohatgi and Szekely (1992),
Arnold, Balakrishnan and Nagaraja (1998, p. 101), Szekely and Gupta (1998), David
and Nagaraja (2003, p. 74), Ahsanullah (2004, p. 23) and Barakat (2012). However,
as we shall show below, this implication is not valid even in the case of a strictly
linear regression, $a_1 b_1 \neq 0$. Note that if $R(X,Y) > 0$ then the converse implication,
$(4) \Rightarrow (3)$, is valid; see Renyi (1959, p. 447) and Dembo, Kagan and Shepp (2001).
Examples of uncorrelated random variables X, Y with (trivial) linear regression 

\[ E(X|Y) = E(Y|X) = 0 \ \text{(a.s.)} \tag{5} \]

and $R(X,Y) > 0 = |\rho(X,Y)|$ have been known for a long time. For instance, P. Bartfai has
calculated $R(X,Y) = 1/3$ for the uniform distribution in the interior of the unit disc. This result
was extended by P. Csaki and J. Fischer for the uniform distribution in the domain
$|x|^p + |y|^p < 1$ ($p > 0$), in which case $R(X,Y) = (p+1)^{-1}$; see Renyi (1959, p. 447)
and Csaki and Fischer (1963). Furthermore, Szekely and Mori (1985) extended this
result to the multivariate case and with different exponents. Moreover, in response to
a question asked by Sid Browne of Columbia University, Dembo, Kagan and Shepp
(2001) constructed a pair (X,Y) satisfying (5) and $R(X,Y) = 1$. (Observe that the
same is true for the uniform distribution in the four-point domain {(0, ±1), (±1, 0)}.)
Using characterizations of Vershik (1964) and Eaton (1986), they also showed that for
any non-Gaussian spherically symmetric random vector $(U_1, \ldots, U_k)$, with covariance
matrix of rank $\geq 2$, there exists a pair of uncorrelated linear forms,

\[ X = a_1 U_1 + \cdots + a_k U_k, \qquad Y = b_1 U_1 + \cdots + b_k U_k, \]

such that (5) is fulfilled and $R(X,Y) > |\rho(X,Y)| = 0$.
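
The four-point example mentioned above can be checked by direct enumeration. The following is a minimal sketch, not part of the original argument: the transformations $g_1(x) = x^2$ and $g_2(y) = -y^2$ are our own illustrative choice, which works because $X^2 + Y^2 = 1$ on the four points.

```python
from math import sqrt

# Uniform distribution on the four points {(0, +-1), (+-1, 0)}.
points = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def mean(values):
    return sum(values) / len(values)

def corr(u, v):
    """Pearson correlation of two equally weighted samples."""
    mu, mv = mean(u), mean(v)
    cov = mean([(a - mu) * (b - mv) for a, b in zip(u, v)])
    var_u = mean([(a - mu) ** 2 for a in u])
    var_v = mean([(b - mv) ** 2 for b in v])
    return cov / sqrt(var_u * var_v)

xs = [p[0] for p in points]
ys = [p[1] for p in points]

# Ordinary correlation of X and Y.
pearson = corr(xs, ys)

# Correlation after the (illustrative) transformations g1(x) = x^2, g2(y) = -y^2.
transformed = corr([x * x for x in xs], [-(y * y) for y in ys])

# Conditional means E(X | Y = y) for each attainable y.
cond_mean_x = {y: mean([x for x, yy in points if yy == y]) for y in set(ys)}

print(pearson, transformed, cond_mean_x)
```

The ordinary correlation and all conditional means vanish, while the transformed pair is perfectly correlated, so the supremum in (2) equals 1 for this distribution.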

However, in the author's opinion, it is important to know definitively that (3)
does not imply (4) even in the non-trivial linear regression case. Indeed, if this
implication were valid in the particular case where $a_1 b_1 \neq 0$, then several works
concerning characterizations of distributions through maximal correlation of order
statistics and records - including the papers by Terrell (1983), Szekely and Mori
(1985), Nevzorov (1992), Lopez-Blazquez and Castano-Martinez (2006), Castano-Martinez,
Lopez-Blazquez and Salamanca-Mino (2007), Papadatos and Xifara (2012)
- would be reduced to trivial consequences of this implication. The same is true for
the main result in Dembo, Kagan and Shepp (2001), since it is easily checked that
for the partial sums $S_k = X_1 + \cdots + X_k$, based on an iid sequence with mean $\mu$ and
finite non-zero variance,

\[ E(S_{n+m}|S_n) = S_n + m\mu \ \text{(a.s.)}, \qquad E(S_n|S_{n+m}) = \frac{n}{n+m}\, S_{n+m} \ \text{(a.s.)}. \]
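
The two regression identities for partial sums can be verified exactly on a small discrete example. The sketch below is illustrative only: the three-point support and the values n = 2, m = 3 are assumptions chosen for the demonstration and do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical iid model: each X_i is uniform on this small support.
support = [0, 1, 3]
n, m = 2, 3
mu = Fraction(sum(support), len(support))

# Enumerate all equally likely outcomes (X_1, ..., X_{n+m}) and group them.
by_total = defaultdict(list)    # S_{n+m} value -> list of S_n values
by_partial = defaultdict(list)  # S_n value -> list of S_{n+m} values
for xs in product(support, repeat=n + m):
    s_n, s_total = sum(xs[:n]), sum(xs)
    by_total[s_total].append(s_n)
    by_partial[s_n].append(s_total)

def average(values):
    return Fraction(sum(values), len(values))

# E(S_{n+m} | S_n = s) should equal s + m * mu.
forward_ok = all(average(v) == s + m * mu for s, v in by_partial.items())

# E(S_n | S_{n+m} = s) should equal n / (n + m) * s.
backward_ok = all(average(v) == Fraction(n, n + m) * s for s, v in by_total.items())

print(forward_ok, backward_ok)
```

Both regressions are strictly linear here, with slopes 1 and n/(n+m), so this pair falls in the case $a_1 b_1 \neq 0$ discussed above.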

The purpose of the present note is to present a quite general class of random
vectors (X,Y), with X and Y possessing strictly linear regression on each other,
and such that $R(X,Y) > |\rho(X,Y)| > 0$. This class is elementary and it is defined
in the next section.
2 Counterexamples

Let $f_1$ and $f_2$ be two univariate probability densities (with respect to Lebesgue
measure on $\mathbb{R}$) with bounded supports, $\mathrm{supp}(f_i) \subseteq [\alpha_i, \omega_i]$, $-\infty < \alpha_i < \omega_i < \infty$ ($i = 1, 2$).
It is well known that there exists a uniquely defined orthonormal polynomial
system $\{\phi_n(x)\}_{n=0}^{\infty}$, standardized by $\mathrm{lead}(\phi_n) := p_n > 0$, where $\mathrm{lead}(\phi_n)$ denotes
the principal coefficient of $\phi_n$. Also, there exists a uniquely defined orthonormal
polynomial system $\{\psi_n(x)\}_{n=0}^{\infty}$, standardized by $\mathrm{lead}(\psi_n) := q_n > 0$. Each system is
complete in the corresponding $L^2$-space, since for any real $t$,

\[ \int_{-\infty}^{\infty} e^{tx} f_1(x)\,dx < \infty \quad \text{and} \quad \int_{-\infty}^{\infty} e^{ty} f_2(y)\,dy < \infty; \]

see, e.g., Berg and Christensen (1981) or Afendras, Papadatos and Papathanasiou
(2011). Since every polynomial is uniformly bounded in any finite interval, we can
find constants $c_n$, $d_n$ such that

\[ 1 \leq \sup_{\alpha_1 \leq x \leq \omega_1} |\phi_n(x)| = c_n < \infty, \qquad 1 \leq \sup_{\alpha_2 \leq y \leq \omega_2} |\psi_n(y)| = d_n < \infty, \qquad n = 1, 2, \ldots. \]

Consider an arbitrary real sequence $\{\rho_n\}_{n=1}^{\infty}$ such that

\[ \sum_{n=1}^{\infty} |\rho_n| c_n d_n \leq 1, \tag{6} \]

e.g., $\rho_n = 6(\pi^2 n^2 c_n d_n)^{-1}$ ($n = 1, 2, \ldots$) or $\rho_n = \lambda n$ ($n = 1, \ldots, N$) and $\rho_n = 0$,
otherwise, where $0 < \lambda \leq (\sum_{n=1}^{N} n c_n d_n)^{-1}$. Then, the function

\[ f(x,y) := f_1(x) f_2(y) \Big( 1 + \sum_{n=1}^{\infty} \rho_n \phi_n(x) \psi_n(y) \Big), \qquad (x,y) \in [\alpha_1, \omega_1] \times [\alpha_2, \omega_2], \tag{7} \]

and $f := 0$ outside $[\alpha_1, \omega_1] \times [\alpha_2, \omega_2]$, is a bivariate probability density with marginal
densities $f_1$, $f_2$; this is so because, due to (6), the series in (7) converges, for each
$(x, y)$ in the domain of definition, to a value greater than or equal to $-1$. (Actually,
the series converges uniformly and absolutely in $[\alpha_1, \omega_1] \times [\alpha_2, \omega_2]$.) Therefore, $f(x, y)$
is nonnegative. Next, it is easily checked that its integral over $\mathbb{R}^2$ equals 1, due to the
orthonormality of the polynomials. Finally, it is obvious that the marginal densities
of $f$ are $f_1$, $f_2$.
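
A concrete instance of the construction (7) may help fix ideas. The sketch below rests on assumptions that are not in the text: uniform marginals $f_1 = f_2 = 1/2$ on $[-1, 1]$, so that $\phi_n = \psi_n = \sqrt{2n+1}\,P_n$ with $P_n$ the Legendre polynomial and $c_n d_n = 2n + 1$, and the hypothetical coefficients $\rho_1 = 1/10$, $\rho_2 = 3/25$, $\rho_n = 0$ otherwise. Exact rational arithmetic is used, which is possible because $\rho_n \phi_n(x) \phi_n(y) = \rho_n (2n+1) P_n(x) P_n(y)$.

```python
from fractions import Fraction as Fr

# Hypothetical coefficients rho_1, rho_2; every other rho_n is taken to be zero.
RHO = {1: Fr(1, 10), 2: Fr(3, 25)}
# Legendre polynomials P_1, P_2 as coefficient lists (constant term first).
P = {1: [Fr(0), Fr(1)], 2: [Fr(-1, 2), Fr(0), Fr(3, 2)]}

def evaluate(poly, t):
    return sum(c * t**k for k, c in enumerate(poly))

def uniform_mean(poly):
    """Exact integral of poly(t) * (1/2) over [-1, 1] (odd powers vanish)."""
    return sum(c * Fr(1, k + 1) for k, c in enumerate(poly) if k % 2 == 0)

def density(x, y):
    """Density (7) for uniform marginals on [-1, 1]."""
    series = sum(r * (2 * n + 1) * evaluate(P[n], x) * evaluate(P[n], y)
                 for n, r in RHO.items())
    return Fr(1, 4) * (1 + series)

def marginal_x(x):
    """Integral of density(x, y) over y, computed term by term."""
    inner = 1 + sum(r * (2 * n + 1) * evaluate(P[n], x) * uniform_mean(P[n])
                    for n, r in RHO.items())
    return Fr(1, 2) * inner

# Left-hand side of condition (6): sum of |rho_n| * c_n * d_n.
condition_6 = sum(abs(r) * (2 * n + 1) for n, r in RHO.items())

# Total mass, again term by term.
total_mass = 1 + sum(r * (2 * n + 1) * uniform_mean(P[n]) ** 2 for n, r in RHO.items())

# Nonnegativity is only spot-checked on a grid here.
grid = [Fr(k, 10) for k in range(-10, 11)]
min_density = min(density(x, y) for x in grid for y in grid)

print(condition_6, total_mass, min_density)
```

With these choices the left-hand side of (6) is 9/10, so the series is bounded below by $-9/10$ and the density is at least $1/40$ everywhere; the grid minimum is consistent with that bound.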

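Assume now that the random vector (X, Y) has density $f$. Then X has density
$f_1$ and Y has density $f_2$. Moreover, versions of the conditional densities are given by

\[ f_{X|Y}(x|y) = f_1(x) \Big( 1 + \sum_{n=1}^{\infty} \rho_n \phi_n(x) \psi_n(y) \Big), \quad \alpha_1 \leq x \leq \omega_1 \quad (\text{for each } y \in \mathrm{supp}(f_2)), \]

\[ f_{Y|X}(y|x) = f_2(y) \Big( 1 + \sum_{n=1}^{\infty} \rho_n \phi_n(x) \psi_n(y) \Big), \quad \alpha_2 \leq y \leq \omega_2 \quad (\text{for each } x \in \mathrm{supp}(f_1)). \]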
Due to the orthonormality of the polynomials it follows that for all $n \geq 1$,

\[ E(\phi_n(X)|Y) = \rho_n \psi_n(Y) \ \text{(a.s.)}, \qquad E(\psi_n(Y)|X) = \rho_n \phi_n(X) \ \text{(a.s.)}. \tag{8} \]

Clearly, if $\rho_1 \neq 0$, (8) shows that X and Y have strictly linear regression on each
other. In fact, it is easily checked, using (8) and induction on $n$, that

\[ E(X^n|Y) = \rho_n \frac{q_n}{p_n} Y^n + P_{n-1}(Y) \ \text{(a.s.)}, \qquad E(Y^n|X) = \rho_n \frac{p_n}{q_n} X^n + Q_{n-1}(X) \ \text{(a.s.)}, \tag{9} \]

where $P_{n-1}(t)$ and $Q_{n-1}(t)$ are polynomials of degree at most $n - 1$ in $t$. Using (9)
and the main result of Papadatos and Xifara (2012), or directly from (8), it is a
simple matter to conclude that $R(X,Y) = \sup_{n \geq 1} |\rho_n|$. Since the choice of $\{\rho_n\}_{n=1}^{\infty}$
is quite arbitrary (see (6)), it follows that

\[ R(X,Y) > |\rho(X,Y)| = |\rho_1| > 0 \quad \text{whenever} \quad 0 < |\rho_1| < \sup_{n \geq 2} |\rho_n|. \]
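
The gap between the two correlations can be seen in a small exact computation. As before, this is only a sketch under assumptions of our own: uniform marginals on $[-1, 1]$ (so $\phi_n = \psi_n = \sqrt{2n+1}\,P_n$, with $P_n$ the Legendre polynomial) and the hypothetical coefficients $\rho_1 = 1/10$, $\rho_2 = 3/25$. Expectations under the density (7) are evaluated term by term, using $E[u(X)v(Y)] = E_0[u]E_0[v] + \sum_n \rho_n (2n+1) E_0[u P_n] E_0[v P_n]$, where $E_0$ denotes expectation under the uniform marginal.

```python
from fractions import Fraction as Fr
from math import sqrt

# Hypothetical coefficients rho_1, rho_2; every other rho_n is taken to be zero.
RHO = {1: Fr(1, 10), 2: Fr(3, 25)}
# Legendre polynomials P_1, P_2 as coefficient lists (constant term first).
P = {1: [Fr(0), Fr(1)], 2: [Fr(-1, 2), Fr(0), Fr(3, 2)]}
ONE = [Fr(1)]

def evaluate(poly, t):
    return sum(c * t**k for k, c in enumerate(poly))

def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def uniform_mean(poly):
    """Exact integral of poly(t) * (1/2) over [-1, 1] (odd powers vanish)."""
    return sum(c * Fr(1, k + 1) for k, c in enumerate(poly) if k % 2 == 0)

def joint_mean(u, v):
    """E[u(X) v(Y)] under the density (7), integrated term by term."""
    extra = sum(r * (2 * n + 1) * uniform_mean(mul(u, P[n])) * uniform_mean(mul(v, P[n]))
                for n, r in RHO.items())
    return uniform_mean(u) * uniform_mean(v) + extra

def corr(u, v):
    mean_u, mean_v = joint_mean(u, ONE), joint_mean(ONE, v)
    cov = joint_mean(u, v) - mean_u * mean_v
    var_u = joint_mean(mul(u, u), ONE) - mean_u**2
    var_v = joint_mean(ONE, mul(v, v)) - mean_v**2
    return float(cov) / sqrt(float(var_u * var_v))

def cond_mean(u, y):
    """E(u(X) | Y = y), from the conditional density f(x, y) / f_2(y)."""
    return uniform_mean(u) + sum(
        r * (2 * n + 1) * uniform_mean(mul(u, P[n])) * evaluate(P[n], y)
        for n, r in RHO.items())

pearson = corr(P[1], P[1])    # correlation of X and Y
corr_deg2 = corr(P[2], P[2])  # correlation of P_2(X) and P_2(Y)
print(pearson, corr_deg2, cond_mean(P[1], Fr(7, 10)))
```

Here the Pearson correlation equals $\rho_1 = 0.1$ and the regression of X on Y is the line $\rho_1 y$, while the degree-two transforms have correlation $\rho_2 = 0.12$. The code only confirms this lower bound on the maximal correlation; that $R(X,Y)$ is exactly $\sup_n |\rho_n|$ is the claim made in the text.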

Remark. (a) It is obvious that the construction (7) can be adapted to the discrete
(lattice) case where $(X,Y) \in \{1, \ldots, N\}^2$, covering the characterizations (for finite
populations) treated by Lopez-Blazquez and Castano-Martinez (2006) and Castano-Martinez,
Lopez-Blazquez and Salamanca-Mino (2007).

(b) Distributions with densities of the form (7) are known as Lancaster distributions;
see, e.g., Koudou (1998) or Diaconis and Griffiths (2012). They can be viewed as
extensions of the Sarmanov-type distribution ($\rho_n = 0$ for $n \geq 2$) which, assuming
standard uniform marginals, generalizes the so-called Farlie-Gumbel-Morgenstern
family.